Estimating Recombination Rate Distribution by Optimal Quantization
Abstract
We obtain recombination rate distribution functions for all human chromosomes using an optimal quantization method. This non-parametric method allows us to control over-/under-fitting. The resulting piecewise-constant recombination rate distribution functions are convenient to store and retrieve. Our experimental results show more abrupt distribution functions than two recently published results, in which over-/under-fitting was not addressed explicitly. Our estimate achieves a greater log-likelihood than a previous result obtained with a Parzen window. This suggests that the optimal quantization technique might be of great advantage for estimating other genomic feature dis-
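As a rough illustration of the comparison described in the abstract, the sketch below fits a piecewise-constant density and a Gaussian Parzen window estimate to synthetic event positions and scores both by held-out log-likelihood. It is not the paper's optimal quantization procedure: equal-count (quantile) bin edges stand in for the optimized ones, and the function names, the unit-length "chromosome", the hotspot location, the bin count, and the bandwidth are all assumptions made for the example.

```python
import numpy as np

def equal_count_edges(samples, n_bins, lo=0.0, hi=1.0):
    """Bin edges holding roughly equal sample counts (a stand-in, not the paper's quantizer)."""
    edges = np.quantile(samples, np.linspace(0.0, 1.0, n_bins + 1))
    edges[0], edges[-1] = lo, hi  # cover the whole (hypothetical) domain
    return edges

def piecewise_constant_density(samples, edges):
    """Density that is constant within each bin and integrates to 1 over the bins."""
    counts, _ = np.histogram(samples, bins=edges)
    return counts / (counts.sum() * np.diff(edges))

def evaluate(density, edges, x):
    """Retrieve the density at positions x with a binary search over the edges."""
    idx = np.searchsorted(edges, x, side="right") - 1
    return density[np.clip(idx, 0, len(density) - 1)]

def log_likelihood(values, floor=1e-12):
    """Sum of log densities, floored to avoid log(0)."""
    return float(np.sum(np.log(np.maximum(values, floor))))

def parzen_density(train, x, bandwidth):
    """Gaussian Parzen window estimate evaluated at positions x."""
    z = (x[:, None] - train[None, :]) / bandwidth
    norm = len(train) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z**2).sum(axis=1) / norm

# Synthetic event positions: uniform background plus one narrow hotspot (assumed data).
rng = np.random.default_rng(0)
background = rng.uniform(0.0, 1.0, size=600)
hotspot = np.clip(rng.normal(0.3, 0.01, size=400), 0.0, 1.0)
events = rng.permutation(np.concatenate([background, hotspot]))
train, held_out = events[:700], events[700:]

edges = equal_count_edges(train, n_bins=20)
density = piecewise_constant_density(train, edges)

ll_pc = log_likelihood(evaluate(density, edges, held_out))
ll_parzen = log_likelihood(parzen_density(train, held_out, bandwidth=0.05))
print("piecewise-constant held-out log-likelihood:", ll_pc)
print("Parzen window held-out log-likelihood:", ll_parzen)
```

Only the `edges` and `density` arrays need to be stored, and a lookup is a single binary search, which is one way to read the abstract's remark that piecewise-constant functions are convenient to store and retrieve. Varying `n_bins` and `bandwidth` and watching the held-out score is one simple way to probe over- and under-fitting in this toy setting.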
Related resources
Estimate Recombination Rate Distribution by Optimal Quantization
Evolutionary biologists are interested in a high-resolution recombination map that accurately depicts how often a recombination event occurs at a specific location in the genome. With the availability of the human genome physical map and fast sequencing technology, researchers have started to estimate recombination rate distributions. We obtain recombination rate distribution functions for all the chromosomes in ...
Optimal quantizers for probability distributions on nonhomogeneous R-triangles
Quantization of a probability distribution refers to the idea of estimating a given probability by a discrete probability supported by a finite set. In this paper, we have considered a Borel probability measure P on R which has support the R-triangle generated by a set of three contractive similarity mappings on R. For this probability measure, the optimal sets of n-means and the nth quantizati...
Asymptotically Optimal Distribution Preserving Quantization for Stationary Gaussian Processes
Distribution preserving quantization (DPQ) has been proposed as a lossy coding tool that yields superior quality over conventional quantization, when applied to perceptually relevant signals. DPQ aims at the optimal rate-distortion trade-off, subject to preserving the source probability distribution. In this article we investigate the optimal DPQ for stationary Gaussian processes and the mean s...
The Time Adaptive Self Organizing Map for Distribution Estimation
The feature map represented by the set of weight vectors of the basic SOM (Self-Organizing Map) provides a good approximation to the input space from which the sample vectors come. But the time-decreasing learning rate and neighborhood function of the basic SOM algorithm reduce its capability to adapt weights for a varied environment. In dealing with non-stationary input distributions and changi...
Two-Locus Likelihoods Under Variable Population Size and Fine-Scale Recombination Rate Estimation.
Two-locus sampling probabilities have played a central role in devising an efficient composite-likelihood method for estimating fine-scale recombination rates. Due to mathematical and computational challenges, these sampling probabilities are typically computed under the unrealistic assumption of a constant population size, and simulation studies have shown that resulting recombination rate est...